Domain motions, dimerization, and membrane interactions of the murine guanylate binding protein 2

Guanylate-binding proteins (GBPs) are a group of GTPases that are induced by interferon-\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\gamma$$\end{document}γ and are crucial components of cell-autonomous immunity against intracellular pathogens. Here, we examine murine GBP2 (mGBP2), which we have previously shown to be an essential effector protein for the control of Toxoplasma gondii replication, with its recruitment through the membrane of the parasitophorous vacuole and its involvement in the destruction of this membrane likely playing a role. The overall aim of our work is to provide a molecular-level understanding of the mutual influences of mGBP2 and the parasitophorous vacuole membrane. To this end, we performed lipid-binding assays which revealed that mGBP2 has a particular affinity for cardiolipin. This observation was confirmed by fluorescence microscopy using giant unilamellar vesicles of different lipid compositions. To obtain an understanding of the protein dynamics and how this is affected by GTP binding, mGBP2 dimerization, and membrane binding, assuming that each of these steps are relevant for the function of the protein, we carried out standard as well as replica exchange molecular dynamics simulations with an accumulated simulation time of more than 30 μs. The main findings from these simulations are that mGBP2 features a large-scale hinge motion in its M/E domain, which is present in each of the studied protein states. When bound to a cardiolipin-containing membrane, this hinge motion is particularly pronounced, leading to an up and down motion of the M/E domain on the membrane, which did not occur on a membrane without cardiolipin. Our prognosis is that this up and down motion has the potential to destroy the membrane following the formation of supramolecular mGBP2 complexes on the membrane surface.

used for realizing the MD simulations. The original GTP parameters apart from the atomic charges were taken from the work of Meagher et al. 12 and incorporated into GROMACS following the procedure by Gao and Eriksson 13 . The partial atomic charges of GTP had to be determined to agree with the Amber99SB*-ILNDP charge assignment philosophy. For this, we followed the procedure that was used for the parametrization of the Amberff99SB-ILDN force field, which employed the restrained electrostatic potential (RESP) method 14,15 based on a Hartree-Fock calculation, using a 6-31G* basis set, of geometry-optimized GTP. The geometry optimization was realized with a density functional-theory calculation with a B3LYP functional and a 6-31G* basis set. These quantum chemical calculations were performed using the Gaussian 09 program 16 , following the recommendations of the Department of Theoretical Chemistry of the Lund University (available at http: //www.teokem.lu.se/ulf/Methods/resp.html) and being in line with procedures described elsewhere 17,18 . The complete parameter file for use in GROMACS was built by combining the derived RESP charges and the GTP parameters available in AMBER frcmod file format 12 via the antechamber 19,20 and tleap 21 tools available in the AmberTools15 software package. The resulting prmtop and prmcrd files were converted to GROMACS topology and coordinate files using the ACPYPE (Ante Chamber PYthon Parser interface) tool 22 .
To obtain a protein model of GTP-bound mGBP2, we made use of the crystal structure of the G domain of hGBP1 co-crystallized with the GTP-analogue GppNHp (PDB ID 2BC9) 23 . For homology modeling, the same procedure as explained above was employed, resulting in a homology model of the G domain of mGBP2, to which the M and E domain of the mGBP2 apo model had to be added. Here, care had to be taken to avoid clashes between the loop involving residues 155-170 and the α12-helix. This was accomplished by slightly stretching the bond between the C and C α -atoms of residue 481, which was enough to move the α12/13-helices sufficiently away from the G domain. All these transformations were done using the VMD software 24 . Finally, the nucleotide GTP and the cation Mg 2+ as co-factor had to be added. To do so, the model was further processed with the free version of the Maestro program 25 . The structure model was loaded together with the superposed coordinates of Mg 2+ and GDPxAlF 3 as available in the crystal structure of the G domain of hGBP1 co-crystalized with GDP-AlF 3 (PDB ID 2B92) 23 . To convert GDP-AlF 3 to GTP, AlF 3 was removed and a phosphate group was attached to the β -phosphate of the existing GDP. The γ-phosphate was added in a way as to avoid atom clashes and be in a reasonable position relative to important protein residues, in particular to K51 and the Mg 2+ co-factor, as described by Kravets 5 . The GTP structure was protonated to mimic the physiological pH value of 7.4, making use of the protonation state of GTP at pH 7.0 as available in the NMR-determined PDB structure with code 2KSQ 26 , giving GTP an overall charge of −4. It was then pre-optimized according to the protein and co-factor environment within Maestro, which makes use of the OPLS 2005 force field 27 . The obtained coordinates defined the input structure for the simulations of GTP-bound mGBP2, denoted as mGBP2 GTP in the paper. mGBP2 model involving the geranylgeranyl lipid anchor. Similar to GTP, the force field parameters of the geranylgeranyl group together with C586, i.e., the residue to which it being attached, were generated using the Antechamber program 19, 20 of 2/20 the AmberTools15 software package 28 and ACPYPE 22 . The electron density calculations were performed using Gaussian 09 16 with the basis set 6-31G* at the Hartree-Fock (HF) level of theory. Partial charges were derived using the RESP method 14,15 following geometry optimization and the electrostatic potential calculations at the HF/6-31G* level. The resulting model involving both GTP in the G domain and the geranylgeranyl lipid anchor at the C-terminus represent the holo-state of mGBP2, thus denoted as mGBP2 holo . mGBP2 dimer model. To build the mGBP2 holo dimer, we superimposed two mGBP2 holo monomers on an energy-minimized structure of the hGBP1 dimer created by Barz et al. 29 using PyMOL 30 . This dimer structure was first equilibrated for 100 ns to test its stability and was found to be stable. mGBP2 monomer on a membrane. For modeling membrane-bound mGBP2, we started by building a symmetric POPC (1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine) membrane, involving 1,418 lipids; this was accomplished with the CHARMM-GUI webserver 31,32 . To determine the influence of GTP on membrane-bound mGBP2, we created two systems: mGBP2 as mGBP2 holo and without GTP, both being placed on the membrane (denoted as mGBP2 GTP mem and mGBP2 noGTP mem , respectively). The geranylgeranyl lipid anchor was inserted into the lipid bilayer by replacing one POPC molecule with the geranylgeranyl group, followed by a short energy minimization in vacuum to remove atom clashes. The parameters for POPC were taken from the Slipids force field 33,34 .
To model a more realistic membrane, we employed CHARMM-GUI to create a symmetric membrane composed of three lipid types, namely 75% 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC), 20% cholesterol, and 5% cardiolipin (CL), using experimental data as input (see below). The GTP-containing mGBP2 was placed on the membrane and anchored to it by inserting its geranylgeranyl group into the membrane, following the same protocol as for the POPC membrane systems. This system is denoted as mGBP2 GTP CL-mem , and it was simulated using the CHARMM36 force field 35,36 . The parameters for the geranylgeranyl group were derived using the CgenFF 37, 38 at https://cgenff.umaryland.edu/. The latest parameters for the protein, GTP, and lipids were downloaded from the http://mackerell.umaryland.edu/charmm_ff.shtml#gromacs webserver.

MD simulation details
GROMACS 2016 or later versions 39,40 were employed for running all MD simulations.

HREMD simulations of mGBP2 monomer in solution.
In order to identify the role of GTP and the geranylgeranyl group on the mGBP2 dynamics, we performed Hamilitonian replica exchange MD simulations for mGBP2 apo , mGBP2 GTP , and mGBP2 holo . To this end, we placed each of the proteins in a rectangle simulation box of 8.5 nm × 9.0 nm× 17.0 nm and added 40,000 water molecules as well as 11, 13, and 12 Na + , respectively, for neutralization, resulting in around~130,000 atoms.
First, an energy minimization was performed using a steepest descent algorithm, followed by an N pT (with N as the number of atoms, p = 1 bar as the pressure, T = 310 K the temperature) position-restraint MD simulation of 500 ps in which the whole protein was restrained with a force constant of 10 kJ mol −1 Å −2 to equilibrate the solvent around the protein. Afterwards, the protein was slowly heated up to 310 K within a 500 ps MD simulation. Subsequently, a 20 ns N pT equilibration was conducted, where we applied position restraints to the rigid β -sheets of the G domain to avoid overall protein rotation and translation. This allowed us to use a smaller simulation box and thus reduce the computational requirements, considering the substantial size of the protein and its elongated shape 29 .
Next, we performed an HREMD simulation 41 with 16 replicas for each system. The energy function of mGBP2 including mGBP2-water interactions was modified in each replica by applying biasing factors of 310 K/T with the 16 temperatures T exponentially distributed between 310 and 370 K. This implies one unbiased replica, the so-called target replica at 310 K. This resulted in an average exchange probability between the neighbored replicas of~50%. Each replica simulation was 200 ns long, leading to a total of 3.2 µs for the 16 replicas. To ensure that GTP stayed in its binding pocket, three distance restraints between GTP and K51, R48, and Y53 were applied using the pull code of GROMACS. For comparison, we also performed an HREMD 3/20 simulation of mGBP2 holo with 40 replicas involving a temperature range between 310 and 520 K and 400 ns per replica, amounting to 16 µs for that simulation. The HREMD simulations were conducted with GROMACS 2016.4 in combination with the PLUMED plugin (version 2.4.1 from https://github.com/GiovanniBussi/plumed2/tree/v2.4) 42 .
In all HREMD simulations, we used the velocity rescaling thermostat and the isotropic Parrinello-Rahman barostat 43 .
Electrostatic interactions were treated with the particle-mesh Ewald (PME) method 44,45 in conjunction with periodic boundary conditions and a real-space cutoff of 10 Å. The Lennard-Jones interactions were cut at 10 Å. The leapfrog stochastic dynamics integrator was used for the integration of equations of motion and the LINCS algorithm 46 for constraining all bond lengths. In the mGBP2 apo system, virtual interaction sites were employed, permitting an integration time step of 4 fs while maintaining energy conservation 47 . For mGBP2 GTP and mGBP2 holo , a time step of 2 fs was applied. Coordinates and velocities were saved every 10 ps.

MD simulations of mGBP2 holo as monomer and dimer in solution.
For the monomeric mGBP2 holo , we created the same system as for the HREMD simulation. The mGBP2 holo dimer was solvated with 184,903 water molecules and the system neutralized with 24 Na + , resulting in a total number of 573,673 atoms and a rectangular simulation box of 13.5 nm × 16.0 nm × 27.0 nm.
The energy of both systems was first minimized using a steepest descent algorithm, followed by equilibration of the systems to the desired temperature of 310 K and pressure of 1 bar to mimic the physiological environment. First, a 0.1 ns NV T equilibration was performed (V being the box volume), followed by a 1 ns N pT equilibration to adjust the pressure. During these steps, the protein's heavy atoms were restrained with a force constant of 10 kJ mol −1 Å −2 , allowing the water molecules to relax around the solute. The velocity rescaling thermostat was employed to regulate the temperature in the NV T simulations, while the Nosé-Hoover thermostat 48,49 and the isotropic Parrinello-Rahman barostat 43 were used for the N pT simulations. The PME method 44,45 was used to calculate the electrostatic interactions with periodic boundary conditions. The cutoff for the van der Waals interactions and Coulombic interaction calculated in real space were set to 12 Å. The LINCS algorithm 46 was used to constrain all bond-lengths during the simulations. Production MD runs were performed for 1 µs for each system with position restraints on the rigid β -sheets of the G domain. The time step for integration was set to 2 fs, and the coordinates and velocities were saved every 20 ps.
MD simulations of membrane-inserted mGBP2. For the mGBP2 GTP mem and mGBP2 noGTP mem systems, we solvated the POPC membrane containing the geranylgeranyl-anchored protein, added 0.1 M NaCl as well as additional Na + to neutralize the overall charge of the system. The resulting box of 21 nm × 21 nm × 15 nm contains mGBP2 GTP mem /mGBP2 noGTP mem as well as 1,417 POPC lipids, 437/435 Na + and 425 Clions, and water, resulting in a total of 662,603/659,946 atoms. To minimize the energy, equilibrate the systems and perform the 1 µs production runs, the same protocol as for the MD simulations of the mGBP2 holo monomer and dimer was used, with the exception that a semi-isotropic Parrinello-Rahman barostat 43 was employed which is common for membrane systems. Moreover, no positional restraints were applied to the G domain, as the protein is already anchored. The same MD protocol was applied to the mGBP2 GTP CL-mem system. It involved a simulation box of size 19.8 nm × 19.8 nm × 17 nm containing 1,020 DOPC, 272 cholesterol and 68 cardiolipin lipids, 158,802 water molecules, 549 Na + and 401 Clions, which is equivalent to 664,034 particles including the protein.

7/20
Supplementary Figure S3. Transition networks of (a) mGBP2 apo and (b) and mGBP2 holo . The TNs were calculated using three descriptors: (i) the presence of three salt bridges (SB) between the G and E domain (R225-E554, R225-E561, K226-E573); (ii) the RMSD of the guanine cap (RMSD GC ); (iii) the distance of the C α atom of L480 between the MD and the start structure (d 480 ). The states of the TNs are arranged such that major d 480 changes occur along the x-axis and major RMSD GC changes along the y-axis. The size of the nodes reflect the population of the corresponding state. The networks are further divided into macrostates, which were determined using the modularity class feature of Gephi that identifies local communities of highly interconnected states. The average and minimum/maximum values of the descriptors per macrostate are provided as [SB, RMSD GC , d 480 ]. In addition, for each macrostate, a representative conformation was extracted and its structural elements underlying descriptor definitions (i) to (iii) (from left to right) are shown along with the starting structure (for RMSD GC and d 480 , in gray) to illustrate the movement in question. The salt bridges and guanine cap are labeled, the C α atom of L480 is shown as red sphere, and the d 480 value is given.

8/20
Supplementary Figure S4 where it is compared with mGBP2 GTP CL-mem (cyan), mGBP2 GTP mem (brown) and mGBP2 noGTP mem (orange). All important motifs, loops and helices are labeled and a background color added using the same colors as for the corresponding structural units as in Fig. 1.

10/20
Supplementary Figure S6. The five most likely mGBP2 dimer models as predicted by AlphaFold-Multimer. The dimer models are colored based on a per-residue estimate of the prediction's confidence (called pLDDT) on a scale from 0-100. Regions with pLDDT > 90 are shown in blue and expected to be modeled to high accuracy. Regions with pLDDT between 70 and 90 are expected to be modeled well (a generally good backbone prediction), while the predictions for regions with pLDDT between 50 and 70 are of low confidence and should be treated with caution. The 3D coordinates of regions with pLDDT < 50 should not be interpreted. Such low pLDDT values are a strong predictor of disorder or that the region in question is only structured as part of a complex. The G and M domains in the AlphaFold2-Multimer models are predicted with high confidence, while it is lower for the E domain. The deviation from the mGBP2 dimer that we constructed based on the X-ray structure of the G-domain dimer of hGBP1 (shown as gray cartoon) is provided in terms of the RMSD of the C α atoms.

11/20
Supplementary Figure S7. Interaction interface of the mGBP2 holo dimer. (a) All residues within 12 Å of the dimer interface are highlighted in dark teal for monomer M1 of the dimer and dark red for monomer M2 of the dimer. These residues are mainly from the G domain and α12/13 of the E domain. GTP is shown as gray spheres, while the parts of the G domain and α12/13 that are not involved in the protein-protein interactions are represented as transparent gray cartoon. (b) List of the residues involved in the interactions. They are combined into eight groups, which are used as labels in (c) where the probabilities of residue-residue contacts between M1 and M2 are shown. The probabilities were calculated considering contacts with a maximum distance of 5 Å between the residues. They are shown as a heat map, in which the depth of the blue color represents the height of the contact probability between 0 and 1. The nature of the contacts is provided, distinguishing between hydrogen bonds (HB), salt bridges (SB), interactions between hydrophobic residues (HP), interactions between polar residues (PL), π-π interactions (PP), and residues that are close to each other but do not belong to any of the special interactions (CL). The consecutive superscripts indicate symmetric interactions.  Supplementary Table S1. Structural elements and function of the important motifs, loops and helices for all three domains of mGBP2. Abbreviations: P-L = phosphate-binding loop, SW1/2 = switch 1/2, L1/2 = loop 1/2, G4 = G4 motif X(V/L)RD, GC = guanine cap, GG = geranylgeranyl.

Motif
Sequence If the residue belongs to a specific motif or helix, then it is written in the brackets. c The nature of the interactions is provided, distinguishing between hydrogen bonds (HB), salt bridges (SB), interactions between hydrophobic residues (HP), interactions between polar residues (PL), and π-π interactions (PP). To further specify the interactions, we also used the difference between backbone (BB) and side chain (SC), where the order is "type of interaction-M1(BB or SC)-M2(BB or SC)". d Very stable interactions are present the whole time (contact probability (CP) ≥ 0.9), while stable interactions are present in at least half of the simulation time (0.5 ≤ CP < 0.9). Unstable interactions are in the range of 0.25 ≤ CP < 0.5, whereas very unstable implies CP < 0.25. The probabilities were calculated considering contacts with a maximum distance of 5 Å between the residues.

16/20
Supplementary Table S5. Average motions of residue L480 at the M/E tip with respect to the MD starting structure as well as motions of residue groups with respect to the membrane surface in 1 µs MD simulations of mGBP2 noGTP mem and mGBP2 GTP mem . The motions of L480 are provided as Cartesian displacements ∆x, ∆y and ∆z as well as its distance from the starting structure. The motions with respect to the membrane surface are given by ∆z values of the centers of mass of the respective residue group, where z is the direction of the membrane normal and z = 0 defines the membrane surface given by the lipid headgroups. a The mean ± standard error of the mean as well as the minimal to maximal values in brackets below are listed.